
The flower dataset contains 4242 images of flowers, collected from Flickr, Google Images and Yandex Images. The pictures are divided into five classes: chamomile, tulip, rose, sunflower and dandelion, with about 800 photos per class. The photos are low resolution, roughly 320x240 pixels, and they are not resized to a single shape, so they have different proportions!
This dataset can be found on https://www.kaggle.com/alxmamaev/flowers-recognition
The aim of this project is to classify a photo of a plant to the correct plant species. The image classification is done using Deep Learning.
In this project Keras is used to build a convolutional neural network for image classification.
The following layers will be added to the CNN:
-Convolutional Layer
-Pooling Layer
-Flattening Layer
-Neural Network
Architecture of a CNN - Source: https://www.mathworks.com/videos/introduction-to-deep-learning-what-are-convolutional-neural-networks--1489512765771.html
This Jupyter notebook was run on a Google Cloud VM with GPU.
#importing libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import shutil
import warnings
import tensorflow as tf
import imageio
from pathlib import Path
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from sklearn.preprocessing import LabelEncoder
warnings.simplefilter('ignore')
%matplotlib inline
All pictures are going to be packed together with the category in a Dataframe.
# Path of the Images
input_path = './data/flowers'
# Get all the species
flower_types = os.listdir(input_path)
print("Number of flower species: ", len(flower_types))
print("Flower species: ", flower_types)
flowers = []
for species in flower_types:
    # Get all the file names
    all_flowers = os.listdir(input_path + '/' + species)
    # Add them to the list
    for flower in all_flowers:
        flowers.append((species, input_path + '/' + species + '/' + flower))

# Build a dataframe
df_flower = pd.DataFrame(data=flowers, columns=['category', 'image'], index=None)
df_flower.head()
df_flower['category'].value_counts()
df_flower['category'].value_counts().plot(kind="bar")
plt.ylabel('count')
As the barplot above shows, all species have roughly the same number of pictures.
In addition, all images are checked to ensure they are in jpg format.
for index, cell in df_flower['image'].iteritems():
    if cell.split(".")[-1] != "jpg":
        print("Cell " + str(index) + " contains a file which is not in jpg format. File: " + cell)
There are some Python files (presumably left over from the scraping scripts), which will be deleted. The dataframe is then recreated, since these Python files are in it.
if os.path.exists("./data/flowers/dandelion/flickr.py"):
    os.remove("./data/flowers/dandelion/flickr.py")
if os.path.exists("./data/flowers/dandelion/run_me.py"):
    os.remove("./data/flowers/dandelion/run_me.py")
if os.path.exists("./data/flowers/dandelion/flickr.pyc"):
    os.remove("./data/flowers/dandelion/flickr.pyc")
# Get all the species
flower_types = os.listdir(input_path)
flowers = []
for species in flower_types:
    # Get all the file names
    all_flowers = os.listdir(input_path + '/' + species)
    # Add them to the list
    for flower in all_flowers:
        flowers.append((species, input_path + '/' + species + '/' + flower))

# Build a dataframe
df_flower = pd.DataFrame(data=flowers, columns=['category', 'image'], index=None)
In addition, the dimensions of the images will be checked.
height = []
width = []
colors = []
for img in df_flower['image']:
    # Read each image only once instead of three times
    shape = imageio.imread(img).shape
    height.append(shape[0])
    width.append(shape[1])
    colors.append(shape[2])
print("Minimum height: ", min(height)," Minimum width: ",min(width)," Minimum Colors: ", min(colors))
plt.hist(width, bins=100)
plt.title("Histogram from width")
plt.xlabel("width")
plt.ylabel("count")
plt.show()
plt.hist(height, bins=100)
plt.title("Histogram from height")
plt.xlabel("height")
plt.ylabel("count")
plt.show()
These histograms are important to determine the input shape from the images.
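Beyond eyeballing the histograms, a quick numeric summary can suggest a target size. The following is a small sketch with stand-in values for the `height` and `width` lists collected above: picking a low percentile of both dimensions means most images get downscaled rather than upscaled during resizing.

```python
import numpy as np

# Stand-in values; in the notebook these are the `height` and `width`
# lists collected from the images above.
height = [240, 240, 212, 320, 240, 180, 240]
width = [320, 320, 320, 240, 319, 240, 320]

# A low percentile of both dimensions gives an upper bound for a square
# input size such that most images are shrunk, not stretched.
target = int(min(np.percentile(height, 10), np.percentile(width, 10)))
print("Suggested upper bound for a square input size:", target)
```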
Finally let's have a look at some flowers. The following images are randomly selected. Each time the cell is executed, different images appear.
# Creating a random_samples array of image paths which contains 5 images of each species
random_samples = []
for species in flower_types:
    for sample in df_flower['image'].where(df_flower['category'] == species).dropna().sample(5).values:
        random_samples.append(sample)

# Plotting all the random samples
counter = 0
fig, ax = plt.subplots(5, 5)
fig.set_size_inches(15, 15)
for i in range(5):
    for j in range(5):
        img = mpimg.imread(random_samples[counter], 1)
        ax[i, j].imshow(img, cmap=None)
        # Path is './data/flowers/<species>/<file>', so the species is element 3
        ax[i, j].set_title('Flower: ' + random_samples[counter].split("/")[3])
        counter += 1
plt.tight_layout()
Lovely. The images are now split into training and test sets; the training set holds 90% of the images.
# Creating train / test folders
train_dir = './data/train/'
test_dir = './data/test/'
if not os.path.exists(train_dir):
    os.makedirs(train_dir)
if not os.path.exists(test_dir):
    os.makedirs(test_dir)

for species in flower_types:
    if not os.path.exists(train_dir + species):
        os.makedirs(train_dir + species)
        os.makedirs(test_dir + species)
    y = df_flower['category'].where(df_flower['category'] == species).dropna()
    X = df_flower['image'].where(df_flower['category'] == species).dropna()
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
    for img in X_train:
        shutil.copy(img, train_dir + species)
    for img in X_test:
        shutil.copy(img, test_dir + species)
First a simple model is created and trained. Then a more complicated model is trained and compared to the simple model.
Since it is much faster to train and test a model on a GPU than on a CPU, the TensorFlow backend for Keras will run on a GPU. The first step is therefore to ensure that a GPU is available.
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
Since a GPU is available Keras should be using TensorFlow on it.
import keras
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import Dense
from keras.models import Model
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
# Initialising the CNN
classifier = Sequential()
[1] The first step of a CNN is the convolution layer.
Convolving a 5x5x1 image with a 3x3x1 kernel yields a 3x3x1 convolved feature.
The objective of the convolution operation is to extract high-level features, such as edges, from the input image. It also helps to discard unnecessary information and reduce dimensionality. The results of this layer are the feature maps.
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
#32 feature detectors with the size of 3x3 will be applied on the input RGB images with size of 64x64
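The convolution operation itself can be illustrated with a short NumPy sketch (just the arithmetic, not the Keras implementation): a 5x5 image convolved with a 3x3 kernel under "valid" padding yields a 3x3 feature map, as described above.

```python
import numpy as np

# Minimal "valid" 2-D convolution (technically cross-correlation, which
# is what CNN layers actually compute).
def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise multiply the window with the kernel and sum
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[0., 0., 0.],
                   [0., 1., 0.],
                   [0., 0., 0.]])  # identity kernel: copies the window centre
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```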
[2] This layer is quite similar to the convolution layer. It writes the largest number from each region of a feature map into a pooled feature map.
By pooling, the most important parts of the feature maps are kept. Furthermore, the size is decreased, which also helps to reduce overfitting.
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
#a 2x2 pooling map will be applied to the feature maps
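Max pooling can likewise be sketched in plain NumPy (an illustration, not the Keras implementation): each non-overlapping 2x2 block of the feature map is reduced to its maximum.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            # Keep only the largest value in each size x size block
            out[i // size, j // size] = feature_map[i:i+size, j:j+size].max()
    return out

fm = np.array([[1., 3., 2., 1.],
               [4., 6., 5., 0.],
               [7., 2., 9., 8.],
               [0., 1., 3., 4.]])
pooled = max_pool2d(fm)
print(pooled)  # [[6. 5.] [7. 9.]]
```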
[3] The feature maps will be flattened into one single vector.
No spatial information is lost, since the values come from the pooled feature maps, which retain the context. The final numbers in the vector are the maxima that the feature detectors extracted.
# Step 3 - Flattening
classifier.add(Flatten())
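For the model above, the shapes work out as follows (a quick sanity check, assuming "valid" padding in the Conv2D layer): the 64x64x3 input becomes 62x62x32 after convolution, 31x31x32 after pooling, and a vector of 30,752 values after flattening.

```python
import numpy as np

# Shape bookkeeping for the layers above: 64x64 RGB input, 32 filters
# of size 3x3 with "valid" padding, then 2x2 max pooling, then Flatten.
conv_out = (64 - 3 + 1, 64 - 3 + 1, 32)              # (62, 62, 32)
pool_out = (conv_out[0] // 2, conv_out[1] // 2, 32)  # (31, 31, 32)
flat_len = int(np.prod(pool_out))
print(flat_len)  # 30752
```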
To classify the images, an artificial neural network will be used. It takes the flattened vector as input, and the output will be one of the five categories. In the neural network, the two following activation functions will be used:
[4] ReLU is the most commonly used activation function in neural networks, especially in CNNs.
[5] The Softmax function is mainly used for multiclass classification.
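Both activation functions are simple to write down; here is a minimal NumPy sketch of what they compute element-wise:

```python
import numpy as np

def relu(x):
    # ReLU: negative inputs become 0, positive inputs pass through
    return np.maximum(0.0, x)

def softmax(x):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(relu(np.array([-2.0, 3.0])))  # [0. 3.]
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])  # one score per flower class
probs = softmax(logits)
print(probs.sum())  # ~1.0, a probability distribution over the 5 classes
```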
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 5, activation = 'softmax'))
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.summary()
Keras has an ImageDataGenerator class which allows users to perform image augmentation on the fly in a very easy way, so that the model never sees the exact same picture twice. This helps prevent overfitting and helps the model generalize better.
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('./data/train',
                                                 target_size = (64, 64),
                                                 batch_size = 32)
test_set = test_datagen.flow_from_directory('./data/test',
                                            target_size = (64, 64),
                                            batch_size = 32)
classifier.fit_generator(training_set,
                         steps_per_epoch = 3888 // 32,  # number of batches per epoch
                         epochs = 10,
                         validation_data = test_set,
                         validation_steps = 435 // 32)
The accuracy of this simple model is around 68%, which is quite impressive. Still, the accuracy can be increased with a more complex model. Additionally, this model has a serious overfitting problem.
[6] VGG16 was published in 2014 and is one of the simpler deep architectures. It is a convolutional neural network model proposed by K. Simonyan and A. Zisserman from the University of Oxford in the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition”. The model achieves 92.7% top-5 test accuracy on ImageNet, a dataset of over 14 million images belonging to 1000 classes.
The VGG16 architecture is given below:

A pretrained VGG16 model can be downloaded from Keras. Unfortunately the pretrained weights cannot be used, since the input shape in this case is different. According to the documentation, the weights will then be randomly initialized. In addition, two dropout layers will be included in the model to reduce overfitting and increase accuracy.
vgg16_model = keras.applications.vgg16.VGG16(input_shape=(80, 80, 3),classes=5, weights=None)
# Store the fully connected layers
fc1 = vgg16_model.layers[-3]
fc2 = vgg16_model.layers[-2]
predictions = vgg16_model.layers[-1]
# Create the dropout layers
dropout1 = Dropout(0.4)
dropout2 = Dropout(0.4)
# Reconnect the layers
model = Sequential()
for layer in vgg16_model.layers[:-3]:
    model.add(layer)
model.add(fc1)
model.add(dropout1)
model.add(fc2)
model.add(dropout2)
model.add(predictions)
model.summary()
The following steps have been taken to increase the accuracy of the base VGG16 model with five training epochs (~70% accuracy):
Each of these steps came with the downside of longer calculation time.
model.compile(keras.optimizers.Adam(lr=0.0001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('./data/train',
                                                 target_size = (80, 80),
                                                 batch_size = 16)
test_set = test_datagen.flow_from_directory('./data/test',
                                            target_size = (80, 80),
                                            batch_size = 16)
history = model.fit_generator(training_set,
                              steps_per_epoch = 3888 // 16,
                              epochs = 50,
                              validation_data = test_set,
                              validation_steps = 435 // 16)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epochs')
plt.legend(['train', 'test'])
plt.show()
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend(['train', 'test'])
plt.show()
Furthermore, the model can still be improved:
vgg16_model = keras.applications.vgg16.VGG16(input_shape=(150, 150, 3),classes=5, weights=None)
# Store the fully connected layers
fc1 = vgg16_model.layers[-3]
fc2 = vgg16_model.layers[-2]
predictions = vgg16_model.layers[-1]
# Create the dropout layers
dropout1 = Dropout(0.4)
dropout2 = Dropout(0.4)
# Reconnect the layers
model2 = Sequential()
for layer in vgg16_model.layers[:-3]:
    model2.add(layer)
model2.add(fc1)
model2.add(dropout1)
model2.add(fc2)
model2.add(dropout2)
model2.add(predictions)
model2.compile(keras.optimizers.Adam(lr=0.0001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
from keras.callbacks import ReduceLROnPlateau
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
                                            patience=3,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.000001)
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('./data/train',
                                                 target_size = (150, 150),
                                                 batch_size = 16)
test_set = test_datagen.flow_from_directory('./data/test',
                                            target_size = (150, 150),
                                            batch_size = 16)
history = model2.fit_generator(training_set,
                               steps_per_epoch = 3888 // 16,
                               epochs = 100,
                               validation_data = test_set,
                               validation_steps = 435 // 16,
                               callbacks=[learning_rate_reduction])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epochs')
plt.legend(['train', 'test'])
plt.show()
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend(['train', 'test'])
plt.show()
As can be seen, the new version of the VGG16 model did not improve and has the same model accuracy. Therefore a new model will be tried: ResNet50.
from keras.applications.resnet50 import ResNet50
resnet50_model = ResNet50(include_top=True, weights=None, input_shape=(150, 150, 3), classes = 5)
resnet50_model.compile(keras.optimizers.Adam(lr=0.0001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
from keras.callbacks import ReduceLROnPlateau
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
                                            patience=3,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.000001)
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('./data/train',
                                                 target_size = (150, 150),
                                                 batch_size = 16)
test_set = test_datagen.flow_from_directory('./data/test',
                                            target_size = (150, 150),
                                            batch_size = 16)
history = resnet50_model.fit_generator(training_set,
                                       steps_per_epoch = 3888 // 16,
                                       epochs = 100,
                                       validation_data = test_set,
                                       validation_steps = 435 // 16,
                                       callbacks=[learning_rate_reduction])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epochs')
plt.legend(['train', 'test'])
plt.show()
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend(['train', 'test'])
plt.show()
Unfortunately, the accuracy on the validation set hasn't changed. Therefore, it will be investigated which plant species had the most problems.
test_set = test_datagen.flow_from_directory('./data/test',
                                            target_size = (150, 150),
                                            batch_size = 435,
                                            shuffle = False)
## copied from source[7]
predictions = resnet50_model.predict_generator(generator=test_set, steps=1)
y_pred = [np.argmax(probas) for probas in predictions]
y_test = test_set.classes
class_names = test_set.class_indices.keys()
from sklearn.metrics import confusion_matrix
import itertools
def plot_confusion_matrix(cm, classes, title='Confusion matrix', cmap=plt.cm.Blues):
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    plt.figure(figsize=(10, 10))
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()
# compute confusion matrix
cnf_matrix = confusion_matrix(y_test, y_pred)
np.set_printoptions(precision=2)
# plot normalized confusion matrix
plt.figure()
plot_confusion_matrix(cnf_matrix, classes=class_names, title='Normalized confusion matrix')
plt.show()
All plant species except the sunflower and the rose were classified well. The sunflower and the chamomile were confused with each other, which is quite surprising, as these two species do not look very similar.
In the next cell all wrongly classified pictures are plotted with predicted and actual labels.
x, y = test_set[0]
class_name = list(test_set.class_indices)
# Indices of all misclassified pictures
mis_class = [i for i in range(len(y_pred)) if y_pred[i] != y_test[i]]
pictures_length = int(round(len(mis_class) / 4, 0))
fig, ax = plt.subplots(pictures_length, 4)
fig.set_size_inches(15, 8 * 15)
count = 0
for i in range(pictures_length):
    for j in range(4):
        ax[i, j].imshow(x[mis_class[count]])
        # The prediction comes from y_pred, the true label from the one-hot vector y
        ax[i, j].set_title("Predicted Flower : " + class_name[y_pred[mis_class[count]]]
                           + "\n" + "Actual Flower : " + class_name[np.argmax(y[mis_class[count]])])
        plt.tight_layout()
        count += 1